Abstract

For over a century, the study of logic has fo- 
cused on the algebra of logical statements. 
This work, first performed by George Boole, 
has led to the development of modern com-
puters, and was shown by Richard T. Cox
to be the foundation of Bayesian inference. 
Meanwhile the logic of questions has been 
much neglected. For our computing machines 
to be truly intelligent, they need to be able 
to ask relevant questions. In this paper I will 
show how the Boolean lattice of logical state- 
ments gives rise to the free distributive lat- 
tice of questions thus defining their algebra. 
Furthermore, there exists a quantity analo- 
gous to probability, called relevance, which 
quantifies the degree to which one question 
answers another. I will show that relevance 
is not only a natural generalization of infor- 
mation theory, but also forms its foundation. 

1 INTRODUCTION 

Intelligent machines need to actively acquire informa-
tion, and the act of asking questions is central to
this capability. Question-asking comes in many forms 
ranging from the simplest where an instrument con- 
tinuously monitors data from a sensor, to the more 
complex where a rover must decide which instrument 
to deploy or measurement to take, and even the more 
human-like where a robot must verbally request infor-
mation from an astronaut during an in-orbit con-
struction task.

Intelligence is not just about providing the correct so- 
lution to a problem. When vital information is lacking, 
intelligence is required to formulate relevant questions. 
For over 150 years mathematicians have studied the 
logic of statements (Boole, 1854); whereas the mathe-
matics of questions has been almost entirely neglected.
In this paper, I will describe recent work performed in 
understanding the algebra of questions, and its associ- 
ated calculus, the inquiry calculus. 

Much of the material presented in this paper relies on 
the mathematics of partially-ordered sets and lattices. 
For this reason, I have included a short appendix to 
which the reader can refer for some of the mathemati- 
cal background. Section §2 briefly discusses questions 
and the motivation for this work. Section §3 introduces 
the formal definition of a question. The lattice of ques- 
tions and its associated algebra is described in Section 
§4. The question algebra is extended to the inquiry
calculus in Section §5 by introducing a bi-valuation
called relevance, which quantifies the degree to which
one question answers another. In Section §6, I show
that the inquiry calculus is not only a natural general- 
ization of information theory, but also forms its foun- 
dation. Section §7 summarizes the results, discusses 
how information theory has been used for some time to 
address question-asking, and describes how this more 
general methodology and deeper understanding will fa- 
cilitate this process. 

2 QUESTIONS 

Each and every one of us asks questions, and has 
done so since being able to construct simple sentences. 
Questions are an essential mechanism by which we obtain
information; however, as we all know, some questions
are better than others. Questions are not always
verbal requests, but are often asked in the form of
physical manipulations or experiments: ‘What happens
when I let go of my cup of milk?’ or ‘Will my mother
make that face again if I drop it a second time?’ Questions
may also be more fundamental, such as the saccade
you make when you detect motion in your peripheral
visual field. Or perhaps the issue is more effectively
resolved by turning your head so as to deploy
both your visual and auditory sensory modalities.
Regardless of their form, “questions are requests for
information” (Caticha, 2004).

Many questions simply cannot be asked: there may 
be no one who will know the answer, no immediate 
way to ask it, you may not be allowed for a variety 
of reasons, or the question may be too expensive with 
respect to some cost criteria. In most situations, these 
questions cannot now be asked directly: ‘Is there life
in Europa’s ocean?’, ‘How fast does the SR-71 Black-
bird fly?’, ‘How would radiation exposure on a Mars
mission affect an astronaut’s health?’, or ‘What is the
neutrino flux emitted from Alpha Centauri?’ In these
cases, one must resort to asking other questions that 
do not directly request the information sought, yet still 
have relevance to the unresolved issue. This sets up the 
iterative process of inquiry and inference, which is es- 
sential to the process of learning — be it active learning 
by a machine, learning performed by a child, or the act 
of doing science by the scientific community. 

Choosing relevant questions is a difficult task that re- 
quires intelligence. Anyone who has tried to perform a 
construction task with the assistance of a small child 
will appreciate this fact. Constantly being asked ‘Do
you need a hammer?’ by even the most enthusiastic
helper can be a great annoyance when you are strug- 
gling to drill a hole. This is precisely the situation we 
will need to avoid when robots are used to assist us in 
difficult and dangerous construction tasks. Relevant 
questions asked by an intelligent assistant will be in- 
valuable to minimizing risks and maximizing produc- 
tivity in human-robot interactions. However, despite 
being an important activity on which we intelligent 
beings constantly rely, the mathematics of quantifying 
the relevance of a question to an outstanding issue has 
been surprisingly neglected. 

3 DEFINING QUESTIONS 

One of the most interesting facts about questions is 
that even though we don’t know the answer, the ques- 
tion is essentially useless if we have absolutely no idea 
of what the answer could be. That is, when questions 
are asked intelligently, we already have a notion of the 
set of possible answers that the resolution may take. 
Richard T. Cox in his last paper captured this idea 
when he defined a question as the set of all logical 
statements that answer it (Cox, 1979). 

The utility of such a definition becomes apparent when 
one considers the set of all possible answers to be a 
hypothesis space. The act of answering a question 
is equivalent to retrieving information, which will be 
used to further refine the probability density function
over the hypothesis space, thereby reducing uncertainty.
This can be formalized to a greater degree,
and to our advantage, by realizing that a set
of logical statements can be partially-ordered by the
binary ordering relation ‘implies’. This set of logical
statements, along with its binary ordering relation,
generically written in order-theoretic notation as $\le$,
forms a partially-ordered set, which can be shown to
be a Boolean lattice (Birkhoff, 1967; Davey & Priestley,
2002). As a concrete example, consider a human-robotic
cooperative construction task involving a robot
named Bender and a human named Fry.¹ Bender has
become aware that Fry will be in need of a tool, but 
must decide which tool Fry will prefer: 

d = ‘Fry needs a drill!’ 
w = ‘Fry needs a wrench!’ 
h = ‘Fry needs a hammer!’ 

These three atomic statements comprise the three mu- 
tually exclusive possibilities in Bender’s hypothesis 
space. The Boolean lattice A (Figure 1), which I will 
interchangeably call the statement lattice or the asser- 
tion lattice, is the powerset of these three statements, 
formed by considering all possible logical disjunctions, 
ordered by the binary ordering relation ‘implies’, $\rightarrow$.
In an ideal situation, Bender’s situational awareness 
would provide sufficient information to allow him to 
infer the tool Fry most probably needs. However, in re- 
ality, this will not always be the case, and Bender may 
need more information to adequately resolve the infer- 
ence. The human way to accomplish this is to simply 
ask Fry for more information. Clearly, the most rele- 
vant question Bender can ask will depend both on the 
probabilities of the various hypotheses in this space, 
and on the specific issue Bender desires to resolve. 

We now introduce a more formal definition of a ques- 
tion, which will allow us to generate a lattice of ques- 
tions from a lattice of logical statements represent- 
ing the hypothesis space. We first define a down-set 
(Davey & Priestley, 2002). 

Definition 1 (Down-set) A down-set is a subset $J$
of an ordered set $L$, written $J = \downarrow L$, where if $a \in J$,
$x \in L$, $x \le a$ then $x \in J$.

Keep in mind that $\le$ represents the ordering relation
for the ordered set; in this case $\le$ is equivalent to $\rightarrow$
for the lattice $\mathcal{A}$. A formalized version of Cox’s defini-
tion of a question follows (Knuth, 2003a, 2004b, c).

Definition 2 (Question) A question $Q$ is defined as
a down-set of logical statements $Q = \downarrow\{a_1, a_2, \ldots, a_n\}$.
The question lattice $\mathcal{Q}$ is the set of down-sets of the
assertion lattice $\mathcal{A}$ ordered by the usual set-inclusion
$\subseteq$, so that $\mathcal{Q} = \mathcal{O}(\mathcal{A})$.

¹ Bender and Fry are characters on the animated television
series Futurama created by Matt Groening.

Figure 1: $\mathcal{A}_3$ is the Boolean lattice formed from three
mutually exclusive assertions ordered by the relation
‘implies’. The bottom element $\bot$ is the absurdity,
which is always false, and the top element $\top$ is the
truism, which is always true.

This defines a question in terms of the set of statements 
that answer it, which includes all the statements that 
imply those statements. Note that I am using lower- 
case letters for assertions (logical statements), upper- 
case letters for questions (or sets), and script letters 
for ordered sets (lattices). The question lattice $\mathcal{Q}$ gen-
erated from Bender’s Boolean assertion lattice $\mathcal{A}$
is shown in Figure 2 with the following notation:

$H = \downarrow h = \{h, \bot\}$

$WH = \downarrow (w \vee h) = \{w \vee h, w, h, \bot\}$

$DWH = \downarrow (d \vee w \vee h) = \{d \vee w \vee h, \ldots\}$

This lattice shows all the possible questions that one
can ask concerning the hypothesis space $\mathcal{A}$. For example,
the question $H \cup DW$ is the set union of the
questions $H$ and $DW$. $H \cup DW$ represents the question
‘Do you or do you not need a hammer?’, since this question
can be answered by the statements $\{d \vee w, d, w, h, \bot\}$,
where $d \vee w$ is equivalent to ‘Fry does not need a
hammer!’, since $\sim h = d \vee w$.² Note that not all of the
questions in $\mathcal{Q}$ have English language equivalents.
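
The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the original development: a statement is encoded as the frozenset of atoms it disjoins (so the empty frozenset plays the role of the absurdity and ‘implies’ becomes the subset test), and the helper names `subsets`, `down`, `is_down_set`, and `all_questions` are hypothetical.

```python
from itertools import combinations

ATOMS = ("d", "w", "h")  # drill, wrench, hammer


def subsets(items):
    """Every subset of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Assumed encoding: {"d", "w"} stands for d OR w, frozenset() for the
# absurdity, and 'x implies y' is the subset test x <= y.
STATEMENTS = subsets(ATOMS)


def down(statement):
    """Down-set of a statement: all statements that imply it."""
    return frozenset(subsets(statement))


def is_down_set(candidate):
    """True if the set of statements is closed under 'implies'."""
    return all(x in candidate for a in candidate for x in subsets(a))


def all_questions():
    """Brute-force every non-empty down-set of the assertion lattice."""
    found = set()
    for r in range(1, len(STATEMENTS) + 1):
        for combo in combinations(STATEMENTS, r):
            candidate = frozenset(combo)
            if is_down_set(candidate):
                found.add(candidate)
    return found


H = down(frozenset("h"))
DW = down(frozenset("dw"))
hammer_or_not = H | DW  # 'Do you or do you not need a hammer?'
```

Under this encoding, `hammer_or_not` contains exactly the five statements listed in the text, and `all_questions()` enumerates the non-empty down-sets of the eight-element assertion lattice by exhaustive search, which is only practical for very small hypothesis spaces.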

4 THE QUESTION ALGEBRA 

The ordered set $\mathcal{Q}$ is comprised of sets ordered by the
usual set inclusion $\subseteq$. This ordering relation naturally
implements the notion of answering. If a question $A$
is defined by a set that is a subset of the answers to
a second question $B$, so that $A \subseteq B$, then answering
the question $A$ will also answer the question $B$. Thus
question $A$ answers question $B$ if and only if $A \subseteq B$.
This allows us to read $A \subseteq B$ as ‘$A$ answers $B$’, and
recognize that questions lower in the lattice (Figure 2)
answer questions higher in the lattice.

² Note also that $\bot$ is the absurd answer, which answers
all questions since it implies everything (see Figure 1).

Figure 2: $\mathcal{Q}_3$ is the free distributive lattice formed from
the assertion lattice $\mathcal{A}_3$. Questions are ordered by set-
inclusion, which implements the relation ‘answers’.

The fact that the ordered set $\mathcal{Q}$ is comprised of sets
ordered by $\subseteq$ implies that it is a distributive lattice
(Knuth, 2003a, b, 2004a, b, c). This means that $\mathcal{Q}$ pos-
sesses two binary algebraic operations, the join $\vee$ and
meet $\wedge$, which are identified with set union $\cup$ and set
intersection $\cap$, respectively (Knuth, 2003a). Just as
the join and meet on the assertion lattice $\mathcal{A}$ can be
identified with the logical disjunction $\vee$ (OR) and the
logical conjunction $\wedge$ (AND), the join and meet on the
question lattice can also be viewed as a disjunction and 
a conjunction of questions, respectively. These opera- 
tions allow us to algebraically manipulate questions as 
easily as we currently manipulate logical statements. 

However, the similarities to the more specific Boolean 
algebra end there. Distributive algebras, in general, 
do not possess the Boolean unary operation of nega- 
tion. In the case of Q it is easy to see why, since the 
complement of a down-set is not a down-set. Thus, in 
general, questions do not possess complements. 
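
As a small illustration of these operations (a sketch under the assumption that statements are encoded as frozensets of atoms, with hypothetical helper names), the following Python fragment checks that ‘answers’ is the subset test, that join and meet are union and intersection, and that the set complement of a question fails to be a down-set.

```python
from itertools import combinations


def subsets(items):
    """Every subset of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def down(statement):
    """Down-set of a statement: all statements that imply it."""
    return frozenset(subsets(statement))


def is_down_set(candidate):
    """True if the set of statements is closed under 'implies'."""
    return all(x in candidate for a in candidate for x in subsets(a))


def answers(a, b):
    """Question a answers question b when a is a subset of b."""
    return a <= b


STATEMENTS = frozenset(subsets("dwh"))
W = down(frozenset("w"))
H = down(frozenset("h"))
DW = down(frozenset("dw"))
WH = down(frozenset("wh"))

join = H | DW                # disjunction of questions (set union)
meet = DW & WH               # conjunction of questions (set intersection)
complement = STATEMENTS - H  # plain set complement of the question H
```

Here `H` answers `join` but not conversely, `meet` reduces to the question `W`, and `complement` contains the statement `d` without containing the absurdity that implies it, so it is not a question.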


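Table 1: The Question Algebra

ORDERING
  Answers          $\le \;\equiv\; \subseteq$
  Reflexivity      For all $A$, $A \le A$
  Antisymmetry     If $A \le B$ and $B \le A$ then $A = B$
  Transitivity     If $A \le B$ and $B \le C$ then $A \le C$

OPERATIONS
  Disjunction      $\vee \;\equiv\; \cup$
  Conjunction      $\wedge \;\equiv\; \cap$
  Idempotency      $A \vee A = A$
                   $A \wedge A = A$
  Commutativity    $A \vee B = B \vee A$
                   $A \wedge B = B \wedge A$
  Associativity    $A \vee (B \vee C) = (A \vee B) \vee C$
                   $A \wedge (B \wedge C) = (A \wedge B) \wedge C$
  Absorption       $A \vee (A \wedge B) = A \wedge (A \vee B) = A$
  Distributivity   $A \wedge (B \vee C) = (A \wedge B) \vee (A \wedge C)$
                   $A \vee (B \wedge C) = (A \vee B) \wedge (A \vee C)$

CONSISTENCY
  $A \le B \;\Leftrightarrow\; A \wedge B = A \;\Leftrightarrow\; A \vee B = B$
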


The join-irreducible elements of the question lattice
$\mathcal{J}(\mathcal{Q})$ are the questions that cannot be written as the
join (set union) of two other questions. These questions
are called the ideal questions (Knuth, 2003a), denoted
$\mathcal{I} = \mathcal{J}(\mathcal{Q})$, reflecting the fact that they are the
ideals (Birkhoff, 1967; Davey & Priestley, 2002) of the
lattice $\mathcal{Q}$. While ideal questions are not very interesting
questions to ask, they are useful mathematical
constructs. The ideal questions form a lattice isomorphic
to the original assertion lattice $\mathcal{A}$. Thus we have
the correspondence where $\mathcal{Q} = \mathcal{O}(\mathcal{A})$ and $\mathcal{A} \sim \mathcal{J}(\mathcal{Q})$.
The lattices $\mathcal{A}$ and $\mathcal{Q}$ are said to be dual in the sense
of Birkhoff’s Representation Theorem (Knuth, 2004a).
Furthermore, the operation $\mathcal{O}$ takes lattice sums to
lattice products, whereas $\mathcal{J}$ takes lattice products to
lattice sums. These maps are the order-theoretic analogues
of the exponential and the logarithm. This will
have important consequences when we generalize the
question algebra to the inquiry calculus.
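
The correspondence between ideal questions and assertions can be checked by brute force on the three-tool example. The sketch below is illustrative only; it assumes the frozenset encoding of statements, takes the question lattice to be the non-empty down-sets, and uses the hypothetical names `QUESTIONS`, `is_ideal`, and `IDEALS`.

```python
from itertools import combinations


def subsets(items):
    """Every subset of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


STATEMENTS = subsets("dwh")


def down(statement):
    """Down-set of a statement: all statements that imply it."""
    return frozenset(subsets(statement))


def is_down_set(candidate):
    """True if the set of statements is closed under 'implies'."""
    return all(x in candidate for a in candidate for x in subsets(a))


# All non-empty down-sets of the assertion lattice (assumed question lattice).
QUESTIONS = [frozenset(c)
             for r in range(1, len(STATEMENTS) + 1)
             for c in combinations(STATEMENTS, r)
             if is_down_set(frozenset(c))]


def is_ideal(q):
    """True if q is not the union of two questions that both differ from q."""
    smaller = [x for x in QUESTIONS if x < q]  # proper sub-questions of q
    return not any(a | b == q for a in smaller for b in smaller)


IDEALS = {q for q in QUESTIONS if is_ideal(q)}
```

In this small case the join-irreducible questions found by `is_ideal` are exactly the eight principal down-sets `down(x)`, one per statement `x`, and `down` preserves the ordering, which is the isomorphism described in the text.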

Figure 3: The hypergraph associated with the question
$H \cup DW$ = ‘Do you or do you not need a hammer?’


There are other important types of questions. The
first definition originated with Cox (1979).

Definition 3 (Real Question) A real question is a
question $Q \in \mathcal{Q}$, which can always be answered by a
true statement. The real sublattice is denoted by $\mathcal{R}$.

It is straightforward to show that a real question enter- 
tains each of the mutually exclusive atomic statements 
of A as acceptable answers (Knuth 2003a). This leads 
to the following proposition, which I will leave for the 
reader to prove. 

Proposition 1 (The Least Real Question) For
all $Q \in \mathcal{Q}$, $Q \in \mathcal{R}$ iff $Q \ge \bigvee \downarrow\mathcal{J}(\mathcal{A})$. The question
$C = \bigvee \downarrow\mathcal{J}(\mathcal{A}) = \min \mathcal{R}$ is the least real question.

Thus the least element in the real sublattice $\mathcal{R}$ is the
question formed from the join of the down-sets of the
mutually exclusive atomic statements of $\mathcal{A}$. In our
example, the least real question is $D \cup W \cup H$. This
question is unique in that it answers all real questions
in $\mathcal{Q}$.

Definition 4 (Central Issue) The central issue is
the least element in the real sublattice $\mathcal{R}$ of the ques-
tion lattice $\mathcal{Q}$, denoted $\min \mathcal{R}$. Answering the central
issue resolves all the real questions in the lattice.

Last, a partition question is a real question that neatly 
partitions its set of answers. Specifically, 

Definition 5 (Partition Question) A partition
question is a real question $P \in \mathcal{R}$ formed from the
join of a set of ideal questions $P = \bigvee_{i=1}^{n} X_i$, where
$\forall\, X_j, X_k \in \mathcal{J}(\mathcal{Q})$, $X_j \wedge X_k = \bot$ when $j \ne k$.

There are five partition questions in our example:
$DWH$, $H \cup DW$, $W \cup DH$, $D \cup WH$, and $D \cup W \cup H$.
Together these questions form a lattice $\mathcal{P}_3$ isomorphic
to the partition lattice $\Pi_3$. Note that the central issue
is the partition question with the maximal number of
partitions. For this reason, it is the least ambiguous
question.
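
The five partition questions can be generated mechanically from the set partitions of the atoms. The following Python sketch is an illustration under the same assumed frozenset encoding; `partitions` and `question_from` are hypothetical helper names.

```python
from itertools import combinations


def subsets(items):
    """Every subset of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def down(statement):
    """Down-set of a statement: all statements that imply it."""
    return frozenset(subsets(statement))


def partitions(items):
    """Yield every set partition of a list as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        yield [[first]] + part  # first element in a block of its own
        for i in range(len(part)):  # or added to an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def question_from(partition):
    """Join (union) of the ideal questions generated by each block."""
    return frozenset().union(*(down(frozenset(b)) for b in partition))


BOTTOM = frozenset({frozenset()})  # the question containing only the absurdity
PARTITION_QUESTIONS = {question_from(p) for p in partitions(list("dwh"))}
CENTRAL = min(PARTITION_QUESTIONS, key=len)  # smallest set of answers
```

Each block of a partition generates one ideal question, distinct blocks meet only in `BOTTOM`, and the smallest of the five resulting questions is $D \cup W \cup H$, which is a subset of (and therefore answers) each of the others.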

The question lattice Q generated from the Boolean lat- 
tice A is known as the free distributive lattice (Knuth, 
2003a). As such, it is isomorphic to the lattice of sim- 
plicial complexes in geometry (Klain & Rota, 1997), 
as well as the lattice of hypergraphs (Knuth, 2004b). 
Thus hypergraphs are a convenient graphical means 
of diagramming questions. Figure 3 shows the hyper-
graph associated with the question $H \cup DW$ = ‘Do you
or do you not need a hammer?’ Such hypergraphs may
play a more significant role when inquiry is united with 
inference in the form of Bayes Nets. 



5 THE INQUIRY CALCULUS 

With the question algebra well-defined, we now extend 
the ordering relation to a quantity that describes the 
degree to which one question answers another. This 
is done by defining a bi-valuation on the lattice that
takes two questions and returns a real number. We
call this bi-valuation the relevance (Knuth, 2004b).

Definition 6 (Relevance) The degree to which a
question $Q$ resolves an outstanding issue $I$, for all
$Q, I \in \mathcal{Q}$, is called the relevance, and is written $d(I|Q)$,
where

$$d(I|Q) = \begin{cases} c & \text{if } Q \le I \text{ ($Q$ answers $I$)} \\ 0 & \text{if } Q \wedge I = \bot \text{ ($Q$ and $I$ are exclusive)} \\ d & \text{otherwise, where } 0 < d < c, \end{cases}$$

with $c$ being the maximal relevance.

This bi-valuation is defined so as to extend the dual of
the zeta function for the lattice, which acts to quantify
order-theoretic inclusion, which in this case indicates
whether the question $Q$ answers the question
$I$ (Knuth, 2004a, b). The utility of this bi-valuation
becomes apparent when one considers $I = \min \mathcal{R}$ to
be the central issue, and $Q \in \mathcal{R}$ to be an arbitrary
real question. The bi-valuation $d(I|Q)$ then quantifies
the degree to which $Q$ resolves the central issue $I$ by
taking a value $d$ where $0 \le d \le c$.³ This is analogous
to the notation in probability theory where $p(x|y)$ describes
the degree to which the statement $y$ implies the
statement $x$ (Cox, 1946, 1961; Jaynes, 2003).

³ Robert Fry originally introduced the notation $b(Q|I)$,
which is equivalent to my notation $d(I|Q)$. For more details,
consult (Knuth, 2004b).

Consistency between the definition of relevance and
the lattice structure (or, equivalently, its algebra) results
in three rules, which describe how relevances relate
to one another (Knuth, 2004a, b). Consistency with
associativity gives rise to the Sum Rule:

$$d(X \vee Y|Q) = d(X|Q) + d(Y|Q) - d(X \wedge Y|Q), \tag{1}$$

and its multi-question generalization

$$d(X_1 \vee X_2 \vee \cdots \vee X_n|Q) = \sum_{i} d(X_i|Q) - \sum_{i<j} d(X_i \wedge X_j|Q) + \sum_{i<j<k} d(X_i \wedge X_j \wedge X_k|Q) - \cdots, \tag{2}$$

which, due to the Möbius function of the distributive
lattice, displays the familiar sum and difference pattern
known as the inclusion-exclusion principle (Klain
& Rota, 1997; Knuth, 2004a, b).

Consistency with distributivity of $\wedge$ over $\vee$ results in
the Product Rule

$$d(X \wedge Y|Q) = c\, d(X|Q)\, d(Y|X \wedge Q), \tag{3}$$

where the real number $c$ is again the maximal relevance.
Note that the calculus cannot simultaneously
support distributivity of $\wedge$ over $\vee$ and distributivity
of $\vee$ over $\wedge$, which are both allowed in a distributive
lattice (Knuth, 2004a, b). It may surprise some to learn
that this is also the case in probability theory.

Last, consistency with commutativity of $\wedge$ results in a
Bayes’ Theorem Analogue

$$d(Y|X \wedge Q) = \frac{d(Y|Q)\, d(X|Y \wedge Q)}{d(X|Q)}. \tag{4}$$

The fact that these three rules are shared between the
inquiry calculus and probability theory is a result of
the fact that both the assertion lattice $\mathcal{A}$ and the question
lattice $\mathcal{Q}$ are distributive lattices, with a Boolean
lattice being a special case of a distributive lattice.

Since the assertion lattice $\mathcal{A}$ and the question lattice $\mathcal{Q}$
are dual to one another in the sense of Birkhoff’s Representation
Theorem, it is not unreasonable to expect
that the values of the relevances of questions must be
consistent with the probabilities of their possible answers.
Given an ideal question $X = \downarrow x$ we require
that

$$d(X|\top) = H(p(x|\top)), \tag{5}$$

where $d(X|\top)$ is the degree to which the question that
asks everything $\top$ answers the question $X$, $p(x|\top)$ is
the degree to which the truism (the top element of $\mathcal{A}$)
implies the statement $x$, and $H$ is a function to be determined.
The result, which relies on partition questions
(Knuth, 2004b) and an important result from
Janos Aczel and colleagues (Aczel, Forte & Ng, 1974),
is that the unique form of the function for a partition
question $P \in \mathcal{P}$ is a linear combination of the Shannon
and Hartley entropies

$$d(P|\top) = a\, H_m(p_1, p_2, \ldots, p_n) + b\, {}_0H_m(p_1, p_2, \ldots, p_n), \tag{6}$$

where $p_i = p(x_i|\top)$, and $a, b$ are arbitrary non-negative
constants. The Shannon entropy (Shannon & Weaver,
1949) is defined as

$$H_m(p_1, p_2, \ldots, p_n) = -\sum_{i=1}^{n} p_i \log_2 p_i, \tag{7}$$

and the Hartley entropy (Hartley, 1928) is defined as

$${}_0H_m(p_1, p_2, \ldots, p_n) = \log_2 N(P), \tag{8}$$

where $N(P)$ is the number of non-zero arguments $p_i$.
By setting the arbitrary constants $a = b = 1$ in (6),
and using (7), (8), we get

$$H_m(p_1, p_2, \ldots, p_n) + {}_0H_m(p_1, p_2, \ldots, p_n) = -\sum_{i=1}^{n} p_i \log_2 \frac{p_i}{n}, \tag{9}$$

which is the relative entropy based on a uniform mea- 
sure. This result is important since it rules out the 
use of other entropies for the purpose of inference and 
inquiry. Any other entropy function will lead to an in- 
consistency between the bi-valuations defined on the 
assertion lattice A and the bi-valuations defined on the 
question lattice Q. 
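
As a numerical illustration of (6) through (9), the sketch below evaluates the Shannon and Hartley entropies for a hypothetical probability assignment and checks that their sum with $a = b = 1$ agrees with the relative-entropy form when every $p_i$ is non-zero. The function names and the example probabilities are assumptions made for the illustration.

```python
from math import isclose, log2


def shannon(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * log2(x) for x in p if x > 0)


def hartley(p):
    """Hartley entropy: log2 of the number of non-zero probabilities."""
    return log2(sum(1 for x in p if x > 0))


def relevance_to_top(p, a=1.0, b=0.0):
    """Linear combination a*Shannon + b*Hartley, as in equation (6)."""
    return a * shannon(p) + b * hartley(p)


def relative_to_uniform(p):
    """Right-hand side of (9); assumes every p_i is non-zero."""
    n = len(p)
    return -sum(x * log2(x / n) for x in p)


# Hypothetical probabilities for drill, wrench, hammer.
p = [0.5, 0.25, 0.25]
combined = relevance_to_top(p, a=1.0, b=1.0)
```

For this assignment the Shannon entropy is 1.5 bits, the Hartley entropy is $\log_2 3$, and `combined` matches `relative_to_uniform(p)` to floating-point precision.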

6 A NATURAL GENERALIZATION 
OF INFORMATION THEORY 

We will now show that these results not only lead nat- 
urally to information theory, but significantly general- 
ize its scope including several generalizations already 
proposed in the literature. For simplicity, we will as-
sign the arbitrary constants so that $a = 1$ and $b = 0$,
and limit ourselves to the Shannon entropy. The main
result of the previous section is that the degree to
which the top question $\top$ answers any partition ques-
tion $P \in \mathcal{P}$ is quantified by the entropy of its answers.
Thus probability quantifies what we know, whereas 
entropy quantifies what we do not know. 

However, more basic quantities also appear, and take 
on new fundamental importance. Since partition ques- 
tions are joins of ideal questions, it is straightforward 
to show, using the sum rule, that the degree to which
$\top$ answers an ideal question $X_i \in \mathcal{I}$ is given by the
probability-weighted surprise

$$d(X_i|\top) = -p_i \log_2 p_i. \tag{10}$$
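
The step from the sum rule to the entropy can be checked numerically. The sketch below is an illustration under stated assumptions: each ideal question is named by its generating statement (a frozenset of atoms), the meet of ideal questions is taken to be generated by the intersection of those statements, the probability-weighted surprise is applied to the generating statement's total probability, and the values in `PROB` are hypothetical.

```python
from itertools import combinations
from math import isclose, log2

PROB = {"d": 0.5, "w": 0.25, "h": 0.25}  # hypothetical p(x|top) values


def surprise(statement):
    """-p log2 p for the statement's probability; 0 for the absurdity."""
    p = sum(PROB[a] for a in statement)
    return -p * log2(p) if p > 0 else 0.0


def relevance_of_join(ideals):
    """Inclusion-exclusion sum over meets of the given ideal questions.

    Each ideal question is represented by its generating statement; the
    meet of several ideal questions is generated by the intersection.
    """
    total = 0.0
    for k in range(1, len(ideals) + 1):
        sign = (-1) ** (k + 1)
        for group in combinations(ideals, k):
            meet = frozenset.intersection(*group)
            total += sign * surprise(meet)
    return total


D, W, H = frozenset("d"), frozenset("w"), frozenset("h")
central = relevance_of_join([D, W, H])
shannon = -sum(p * log2(p) for p in PROB.values())
```

Because the atomic statements are mutually exclusive, every meet term vanishes and `central` reduces to the sum of the three surprises, which equals the Shannon entropy of the assignment (1.5 bits for these values).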

If we look at our earlier example, we can compute the
degree to which $\top$ answers the question $DW \vee WH$.
This is easily done using the sum rule, which gives

$$d(DW \vee WH|\top) = d(DW|\top) + d(WH|\top) - d(DW \wedge WH|\top). \tag{11}$$

Clearly this quantity is related to the mutual information
between $DW$ and $WH$, which when written in
standard notation would look like

$$I(DW; WH) = H(DW) + H(WH) - H(DW, WH), \tag{12}$$

with $d(DW \wedge WH|\top)$ being related to the joint entropy.
Thus mutual information is related to the disjunction
of two issues, whereas the joint entropy is related
to the conjunction of two issues. However, in this
illustration it is important to note that (11) is not exactly
a mutual information since neither $DW$ nor $WH$ are
partition questions; however, with a larger hypothesis
space it is trivial to construct the mutual information
in this way.

By considering the disjunction and conjunction of
multiple issues, one can construct relevances that
are higher-order mutual informations and higher-order
joint entropies that exhibit the sum and difference patterns
in the multi-question generalization of the sum
rule. Higher-order generalizations such as these were
independently suggested by several authors (McGill,
1955; Cox, 1961, 1979; Bell, 2003); however, here one
can see that they occur naturally as a result of the
inquiry calculus.

To consider the conjunction and disjunction of questions
to the right of the solidus, one must use the
sum and product rules in conjunction with the Bayes’
theorem analogue to move questions from one side
of the solidus to the other. The following example
demonstrates a typical calculation, which also includes
some algebraic manipulation. Consider again
Bender’s central issue $T$ = ‘Which tool do you need?’
However, Bender has asked this question 10 times
in the last hour, and Fry is getting quite irritated
and will lose his temper if he hears that
question again. To find another question, Bender
computes the relevance that the question $Q_H$ =
‘Do you or do you not need a hammer?’ has on the issue.
This calculation results in

$$\begin{aligned}
d(T|Q_H) &= d(T|Q_H \wedge \top) \\
&= d(Q_H|T \wedge \top)\, \frac{d(Q_H|\top)}{d(T|\top)} \\
&= d(Q_H|T)\, \frac{d(Q_H|\top)}{d(T|\top)} \\
&= c\, \frac{d(Q_H|\top)}{d(T|\top)},
\end{aligned} \tag{13}$$

where the result is simply a ratio of two entropies.
Note that this formalism relies on relevances that are
conditional, like probabilities. This notion is absent
in traditional information theory, and is another way in
which the inquiry calculus is a natural generalization.


7 DISCUSSION


We have demonstrated that the question algebra and 
the inquiry calculus follow naturally from a straight- 
forward definition of a question as the set of state- 
ments that answer it. The question algebra enables 
one to manipulate questions algebraically as easily as 
we currently manipulate logical statements, whereas 
the inquiry calculus allows us to quantify the degree
to which one question answers another. This method-
ology promises to enable us to design machines that 
can identify maximally relevant questions in order to 
actively obtain information. This work has clear im- 
plications for areas of research that rely on question- 
asking, such as experimental design (Lindley, 1956; 
Loredo, 2004), search theory (Pierce, 1979), and ac- 
tive learning (MacKay, 1992), each of which has taken 
advantage of information theory during their histories. 
In addition, this approach has already shown promise 
in several applications by Robert Fry (1995, 2002). 

However, the inquiry calculus is more fundamental 
than information theory in the sense that it derives 
directly from the question algebra. The sole postulate 
is that the bi-valuations on the dual lattices are de- 
fined consistently. The result is that the Shannon and 
Hartley entropies are the only entropies that can be 
used for the purposes of inquiry — all other entropies 
will lead to inconsistencies. Entropy is related to the 
relevances involving the partition questions, mutual in- 
formation is related to disjunctions of questions, and 
joint entropy is related to conjunctions of questions. 
Higher-order informations occur naturally when multi- 
ple disjunctions and conjunctions are considered. Last, 
the calculus allows for, and relies on, conditional quan- 
tities not considered in traditional information theory. 
The result is an algebra and a calculus that takes the 
guesswork out of defining information-theoretic cost 
functions in applications involving question-asking. 
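
To make the preceding point concrete, the ratio-of-entropies result (13) can serve directly as a cost function for choosing among candidate questions. The Python sketch below is only an illustration: it assumes $a = 1$, $b = 0$, and $c = 1$, assumes that the form of (13) derived for the hammer question carries over to the other two-block partition questions, and uses hypothetical probabilities and question wordings.

```python
from math import isclose, log2

PROB = {"d": 0.5, "w": 0.25, "h": 0.25}  # hypothetical p(x|top) values


def entropy(blocks):
    """Shannon entropy (bits) of a partition question given as atom blocks."""
    ps = [sum(PROB[a] for a in block) for block in blocks]
    return -sum(p * log2(p) for p in ps if p > 0)


CENTRAL = ["d", "w", "h"]  # 'Which tool do you need?'

# Hypothetical candidates: each is a partition of the atoms into blocks.
CANDIDATES = {
    "Do you or do you not need a hammer?": ["h", "dw"],
    "Do you or do you not need a wrench?": ["w", "dh"],
    "Do you or do you not need a drill?": ["d", "wh"],
}


def relevance(blocks, c=1.0):
    """Relevance to the central issue as a ratio of entropies, as in (13)."""
    return c * entropy(blocks) / entropy(CENTRAL)


best = max(CANDIDATES, key=lambda q: relevance(CANDIDATES[q]))
```

With these probabilities the drill question splits the hypothesis space evenly, so its entropy is 1 bit against 1.5 bits for the central issue, and it is selected as the most relevant of the three alternatives.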

Our explorations into the realm of questions are only 
beginning, and it would be naive to think that the 
work presented here is the entire story. Recently, Ariel 
Caticha presented an alternative approach to viewing 
a question as a probability distribution, which is in 
some ways simultaneously more general yet more re- 
strictive than the approach presented here (Caticha, 
2004). The result is a measure of relevance described 
by relative entropy. It will be interesting to see where 
these new investigations lead. 

APPENDIX: POSETS AND 
LATTICES 

In this section I introduce some basic concepts of order 
theory that are necessary to understand the spaces of 
logical statements and questions. Order theory cap- 
tures the notion of ordering elements of a set. For a 
given set, one associates a binary ordering relation to 
form what is called a partially-ordered set, or a poset 
for short. This ordering relation, generically written
$\le$, satisfies reflexivity, antisymmetry, and transitivity.
The ordering $a \le b$ is generally read ‘$b$ includes $a$’.
When $a \le b$ and $a \ne b$, we write $a < b$. Furthermore,
if $a < b$, but there does not exist an element $x$ in the
set such that $a < x < b$, then we write $a \prec b$, read ‘$b$
covers $a$’, indicating that $b$ is a direct successor to $a$ in
the hierarchy induced by the ordering relation. This
concept of covering can be used to construct diagrams
of a poset. If an element $b$ includes an element $a$ then
it is drawn higher in the diagram. If $b$ covers $a$ then
they are connected by a line.

A poset $P$ possesses a greatest element if there exists
an element $\top \in P$, called the top, where $x \le \top$ for all
$x \in P$. Dually, a poset may possess a least element
$\bot \in P$, called the bottom. The elements that cover the
bottom are called atoms.

Given two elements $x$ and $y$, their upper bound is de-
fined as the set of all $z \in P$ such that $x \le z$ and $y \le z$.
If a unique least upper bound exists, it is called the
join, written $x \vee y$. Dually, we can define the lower
bound and the greatest lower bound, which if it exists,
is called the meet, $x \wedge y$. Graphically the join of two ele-
ments can be found by following the lines upward until
they first converge on a single element. The meet can 
be found dually. Elements that cannot be expressed 
as a join of two elements belong to a special set of 
elements called join-irreducible elements. 

The dual of a poset $P$, written $P^\partial$, can be formed by re-
versing the ordering relation, which can be visualized
by flipping the poset diagram upside-down. This ac- 
tion exchanges joins and meets and is the reason that 
their relations come in pairs (see Table 1). 

A lattice L is a poset where the join and meet exist for 
every pair of elements. We can view the lattice from 
a structural viewpoint as a set of objects arranged by 
an ordering relation $\le$. However, we can also view the
lattice from an operational viewpoint as an algebra
on the space of elements with the operations $\vee$ and $\wedge$
along with any other relations induced by the ordering 
relation. The join and meet obey idempotency, com- 
mutativity, associativity, and the absorption property. 

The act of generalizing an algebra to a calculus goes 
back at least as far as 1946 when Richard Cox de- 
rived probability theory by requiring consistency with 
the Boolean algebra of logical statements (Cox 1946, 
1961). This led to the perspective where probabili- 
ties are viewed as degrees of implication and probabil- 
ity theory is viewed as an extension of logic (Jaynes, 
2003). Ariel Caticha, working with quantum mechan- 
ical experimental setups showed that a distributive al- 
gebra can lead to sum and product rules with com- 
plex valuations (Caticha, 1998). When combined with 
lattice theory, this leads to a very powerful means 
of deriving measures from ordering relations (Knuth, 
2004a). 



Acknowledgements 

This work was supported by the NASA IDU/IS/CICT
Program and the NASA Aerospace Technology Enter- 
prise. I am deeply indebted to Ariel Caticha, Bob Fry, 
Janos Aczel, and Kevin Wheeler for insightful and in- 
spiring discussions. 

References 

Aczel J., Forte B. & Ng C.T. (1974). Why the Shannon 
and Hartley entropies are ‘natural’. Adv. Appl. Prob., 
Vol. 6, pp. 131-146. 

Bell A.J. (2003). The co-information lattice. Proceed- 
ings of the Fifth International Workshop on Indepen- 
dent Component Analysis and Blind Signal Separation: 
ICA 2003 (eds. S. Amari, A. Cichocki, S. Makino and
N. Murata).

Birkhoff G.D. (1967). Lattice Theory, Provi- 
dence:American Mathematical Society. 

Boole G. (1854). An Investigation of the Laws of 
Thought. London:Macmillan. 

Caticha A. (1998). Consistency, amplitudes and prob- 
abilities in quantum theory. Phys. Rev. A, Vol. 57, 
pp. 1572-1582. 

Caticha A. (2004). Questions, relevance and rela- 
tive entropy. In press: Bayesian Inference and Max- 
imum Entropy Methods in Science and Engineering, 
Garching, Germany, August 2004 (eds. R. Fischer, R.
Preuss, U. Von Toussaint, V. Dose). AIP Conf. Proc., 
Melville NY:AIP. 

Cox R.T. (1946). Probability, frequency, and reason- 
able expectation. Am. J. Physics, Vol. 14, pp. 1-13. 

Cox R.T. (1961). The algebra of probable inference. 
Baltimore:Johns Hopkins Press. 

Cox R.T. (1979). Of inference and inquiry. In The 
Maximum Entropy Formalism (eds. R. D. Levine & 
M. Tribus). Cambridge:MIT Press, pp. 119-167. 

Davey B.A. & Priestley H.A. (2002). Introduction 
to Lattices and Order. Cambridge:Cambridge Univ. 
Press. 

Fry R.L. (1995). Observer-participant models of neu- 
ral processing. IEEE Trans. Neural Networks, Vol. 6, 
pp. 918-928. 

Fry R.L. (2002). The engineering of cybernetic sys- 
tems. In Bayesian Inference and Maximum Entropy 
Methods in Science and Engineering, Baltimore MD, 
USA, August 2001 (ed. R. L. Fry). New York:AIP, 
pp. 497-528. 

Hartley R.V. (1928). Transmission of information.
Bell System Tech. J., Vol. 7, pp. 535-563.

Jaynes E.T. (2003). Probability theory: the logic of 
science. Cambridge:Cambridge Univ. Press.

Klain D.A. & Rota G.-C. (1997). Introduction to 
geometric probability. Cambridge:Cambridge Univ. 
Press. 

Knuth K.H. (2003a). What is a question? In Bayesian 
Inference and Maximum Entropy Methods in Science 
and Engineering, Moscow ID, USA, August 2002 (ed. 
C. Williams). AIP Conf. Proc. Vol. 659, Melville 
NY:AIP, pp. 227-242.

Knuth K.H. (2003b). Intelligent machines in the 21st 
century: foundations of inference and inquiry, Phil. 
Trans. Roy. Soc. Lond. A, Vol. 361, No. 1813, pp. 
2859-2873. 

Knuth K.H. (2004a). Deriving laws from ordering re- 
lations. In Bayesian Inference and Maximum Entropy 
Methods in Science and Engineering, Jackson Hole 
WY, USA, August 2003 (ed. G. J. Erickson). AIP 
Conf. Proc. Vol. 707, Melville NY:AIP, pp. 204-235. 

Knuth K.H. (2004b). Lattice duality: The origin of 
probability and entropy. In press: Neurocomputing. 

Knuth K.H. (2004c). Measuring questions: Relevance 
and its relation to entropy. In press: Bayesian Infer- 
ence and Maximum Entropy Methods in Science and 
Engineering, Garching, Germany, August 2004 (eds. 
R. Fischer, R. Preuss, U. Von Toussaint, V. Dose). 
AIP Conf. Proc., Melville NY:AIP.

Lindley D.V. (1956). On the measure of information 
provided by an experiment. Ann. Math. Statist. Vol. 
27, pp. 986-1005. 

Loredo T.J. (2004). Bayesian adaptive exploration. 
In: Bayesian Inference and Maximum Entropy Meth- 
ods in Science and Engineering, Jackson Hole WY, 
USA, August 2003 (ed. G. J. Erickson). AIP Conf. 
Proc. Vol. 707, Melville NY:AIP, pp. 330-346. 

MacKay D.J.C. (1992). Information-based objective 
functions for active data selection. Neural Computa- 
tion Vol. 4 No. 4, pp. 589-603. 

McGill W.J. (1955). Multivariate information trans- 
mission. IEEE Trans Info Theory, Vol. 4, pp. 93-111.

Pierce J.G. (1979). A new look at the relation be- 
tween information theory and search theory. In The 
Maximum Entropy Formalism (eds. R. D. Levine & 
M. Tribus), Cambridge:MIT Press, pp. 339-402. 

Shannon C.E. & Weaver W. (1949). A mathematical 
theory of communication. Chicago:Univ. of Ill. Press.



